Overview

Dataset statistics

Number of variables: 22
Number of observations: 2785189
Missing cells: 12252161
Missing cells (%): 20.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 430.3 MiB
Average record size in memory: 162.0 B
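
The missing-cell percentage can be cross-checked from the figures above, taking the total number of cells as variables × observations. A minimal sketch in plain Python; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics above.
n_variables = 22
n_observations = 2_785_189
missing_cells = 12_252_161

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The same ratio, applied per column (missing count divided by the number of observations), reproduces the per-variable percentages listed under Alerts.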

Variable types

Numeric: 10
Unsupported: 2
Text: 2
DateTime: 4
Categorical: 2
Boolean: 2

Alerts

High correlation: IBNR is highly overall correlated with starting_station_IBNR
High correlation: arrival_delay_m is highly overall correlated with departure_delay_m
High correlation: departure_delay_m is highly overall correlated with arrival_delay_m
High correlation: info is highly overall correlated with info_present and 4 other fields
High correlation: info_present is highly overall correlated with info and 1 other field
High correlation: lat is highly overall correlated with info and 1 other field
High correlation: long is highly overall correlated with info
High correlation: starting_station_IBNR is highly overall correlated with IBNR and 1 other field
High correlation: transformed_info_message is highly overall correlated with info and 1 other field
High correlation: zip is highly overall correlated with info and 2 other fields
Imbalance: canceled is highly imbalanced (61.2%)
Imbalance: transformed_info_message is highly imbalanced (53.9%)
Missing: last_station has 40796 (1.5%) missing values
Missing: IBNR has 135873 (4.9%) missing values
Missing: long has 142208 (5.1%) missing values
Missing: lat has 142208 (5.1%) missing values
Missing: arrival_plan has 1183672 (42.5%) missing values
Missing: departure_plan has 972317 (34.9%) missing values
Missing: arrival_change has 1422439 (51.1%) missing values
Missing: departure_change has 1281468 (46.0%) missing values
Missing: arrival_delay_m has 972317 (34.9%) missing values
Missing: departure_delay_m has 972317 (34.9%) missing values
Missing: info has 2201357 (79.0%) missing values
Missing: clear_station_name has 2785189 (100.0%) missing values
Unsupported: line is an unsupported type, check if it needs cleaning or further analysis
Unsupported: clear_station_name is an unsupported type, check if it needs cleaning or further analysis
Zeros: arrival_delay_m has 1256668 (45.1%) zeros
Zeros: departure_delay_m has 1190749 (42.8%) zeros

Reproduction

Analysis started: 2024-11-17 18:57:13.391912
Analysis finished: 2024-11-17 18:59:41.505396
Duration: 2 minutes and 28.11 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
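
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Start and end of the profiling run, as listed under Reproduction.
started = datetime.fromisoformat("2024-11-17 18:57:13.391912")
finished = datetime.fromisoformat("2024-11-17 18:59:41.505396")

elapsed = finished - started
# Split the elapsed seconds into whole minutes and the remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```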

Variables

ID_Base
Real number (ℝ)

Distinct: 50013
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2.2534825 × 10^16
Minimum: -9.223177 × 10^18
Maximum: 9.223057 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 1399190
Negative (%): 50.2%
Memory size: 21.2 MiB

Quantile statistics

Minimum: -9.223177 × 10^18
5-th percentile: -8.3285499 × 10^18
Q1: -4.5909339 × 10^18
Median: -4.62402 × 10^16
Q3: 4.5625228 × 10^18
95-th percentile: 8.331309 × 10^18
Maximum: 9.223057 × 10^18
Range: -5.101451 × 10^14 (as reported; Maximum minus Minimum is about 1.8446 × 10^19)
Interquartile range (IQR): 9.1534567 × 10^18
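
The reported Range is negative even though Maximum is larger than Minimum. One plausible explanation, offered here as an assumption because the report does not say how Range is computed, is that the difference was taken in signed 64-bit integers and wrapped around. The sketch below uses a hypothetical helper, `wrap_int64`, and the extremes listed in this variable's value tables:

```python
# Extremes copied from the ID_Base value tables (rounded to 10 significant digits).
minimum = -9_223_176_951 * 10**9
maximum = 9_223_056_978 * 10**9


def wrap_int64(value: int) -> int:
    """Reduce an arbitrary Python int to a signed 64-bit two's-complement value."""
    return (value + 2**63) % 2**64 - 2**63


# Python ints are unbounded, so this is the mathematically correct range.
true_range = maximum - minimum
# What a signed 64-bit subtraction would produce instead.
wrapped_range = wrap_int64(true_range)

print(f"true range:    {true_range:.7e}")
print(f"wrapped range: {wrapped_range:.7e}")
```

With these rounded inputs the wrapped value lands within rounding distance of the reported -5.101451 × 10^14, which is consistent with the overflow explanation but does not prove it.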

Descriptive statistics

Standard deviation: 5.3245568 × 10^18
Coefficient of variation (CV): -236.28126
Kurtosis: -1.19281
Mean: -2.2534825 × 10^16
Median Absolute Deviation (MAD): 4.5737344 × 10^18
Skewness: 0.0086452172
Sum: -7.9239271 × 10^18
Variance: 2.8350905 × 10^37
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
2.256484864 × 10^18 | 413 | < 0.1%
8.467202706 × 10^18 | 413 | < 0.1%
-7.996941865 × 10^18 | 412 | < 0.1%
8.668076605 × 10^18 | 391 | < 0.1%
-2.094717035 × 10^18 | 378 | < 0.1%
2.688663988 × 10^18 | 361 | < 0.1%
-8.560851479 × 10^18 | 350 | < 0.1%
-1.78380972 × 10^17 | 350 | < 0.1%
1.373163551 × 10^18 | 350 | < 0.1%
7.295358315 × 10^18 | 350 | < 0.1%
Other values (50003) | 2781421 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
-9.223176951 × 10^18 | 10 | < 0.1%
-9.222914435 × 10^18 | 7 | < 0.1%
-9.222740236 × 10^18 | 2 | < 0.1%
-9.222707015 × 10^18 | 5 | < 0.1%
-9.222587614 × 10^18 | 17 | < 0.1%
-9.222235769 × 10^18 | 49 | < 0.1%
-9.221813993 × 10^18 | 210 | < 0.1%
-9.221685702 × 10^18 | 7 | < 0.1%
-9.221229322 × 10^18 | 25 | < 0.1%
-9.221103336 × 10^18 | 98 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
9.223056978 × 10^18 | 2 | < 0.1%
9.221732242 × 10^18 | 84 | < 0.1%
9.221398198 × 10^18 | 7 | < 0.1%
9.221055243 × 10^18 | 63 | < 0.1%
9.220892138 × 10^18 | 20 | < 0.1%
9.22087069 × 10^18 | 56 | < 0.1%
9.220854484 × 10^18 | 14 | < 0.1%
9.219893508 × 10^18 | 144 | < 0.1%
9.219684671 × 10^18 | 5 | < 0.1%
9.219589171 × 10^18 | 14 | < 0.1%

ID_Timestamp
Real number (ℝ)

Distinct: 10155
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4071086 × 10^9
Minimum: 2.4070319 × 10^9
Maximum: 2.4071424 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 2.4070319 × 10^9
5-th percentile: 2.4070807 × 10^9
Q1: 2.4070914 × 10^9
Median: 2.4071109 × 10^9
Q3: 2.4071223 × 10^9
95-th percentile: 2.4071415 × 10^9
Maximum: 2.4071424 × 10^9
Range: 110502
Interquartile range (IQR): 30851

Descriptive statistics

Standard deviation: 21614.602
Coefficient of variation (CV): 8.9794879 × 10^-6
Kurtosis: -0.0095195568
Mean: 2.4071086 × 10^9
Median Absolute Deviation (MAD): 19390
Skewness: -0.35481539
Sum: 6.7042524 × 10^15
Variance: 4.6719104 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
2407090833 | 924 | < 0.1%
2407080833 | 921 | < 0.1%
2407100833 | 909 | < 0.1%
2407110833 | 909 | < 0.1%
2407091633 | 905 | < 0.1%
2407081633 | 899 | < 0.1%
2407120833 | 898 | < 0.1%
2407090733 | 896 | < 0.1%
2407080733 | 895 | < 0.1%
2407111633 | 891 | < 0.1%
Other values (10145) | 2776142 | 99.7%

Minimum 10 values

Value | Count | Frequency (%)
2407031857 | 3 | < 0.1%
2407040236 | 24 | < 0.1%
2407040245 | 11 | < 0.1%
2407040253 | 13 | < 0.1%
2407040302 | 19 | < 0.1%
2407040303 | 6 | < 0.1%
2407040312 | 22 | < 0.1%
2407040313 | 37 | < 0.1%
2407040314 | 6 | < 0.1%
2407040317 | 67 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2407142359 | 6 | < 0.1%
2407142358 | 17 | < 0.1%
2407142357 | 3 | < 0.1%
2407142356 | 9 | < 0.1%
2407142355 | 6 | < 0.1%
2407142354 | 8 | < 0.1%
2407142353 | 22 | < 0.1%
2407142352 | 10 | < 0.1%
2407142351 | 42 | < 0.1%
2407142350 | 29 | < 0.1%
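
The ID_Timestamp values look like timestamps packed into ten digits as YYMMDDHHMM: every value listed parses to a valid date and time in July 2024, the same month as the plan and change columns. This reading is an inference from the listed values, not something the report documents. Under that assumption, a minimal decoder (the function name `decode_id_timestamp` is illustrative) could look like this:

```python
from datetime import datetime


def decode_id_timestamp(value: int) -> datetime:
    """Interpret a 10-digit ID_Timestamp as YYMMDDHHMM (an assumed encoding)."""
    return datetime.strptime(f"{value:010d}", "%y%m%d%H%M")


first = decode_id_timestamp(2407031857)  # smallest value listed for this variable
last = decode_id_timestamp(2407142359)   # largest value listed for this variable
print(first.isoformat(), last.isoformat())
```

If the encoding holds, the numeric statistics for this variable (mean, standard deviation, range) are computed on the packed digits and do not correspond to durations.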

stop_number
Real number (ℝ)

Distinct: 59
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.7907115
Minimum: 1
Maximum: 59
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 8
Q3: 14
95-th percentile: 25
Maximum: 59
Range: 58
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.5890828
Coefficient of variation (CV): 0.77513088
Kurtosis: 0.7903732
Mean: 9.7907115
Median Absolute Deviation (MAD): 5
Skewness: 1.0328565
Sum: 27268982
Variance: 57.594178
Monotonicity: Not monotonic
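
Several of the descriptive statistics are derived from one another, which allows a quick consistency check. A short sketch using the stop_number figures, assuming the usual definitions (CV as standard deviation divided by mean, variance as the squared standard deviation, range and IQR as simple differences):

```python
# Figures copied from the stop_number statistics.
mean = 9.7907115
std = 7.5890828
q1, q3 = 4, 14
minimum, maximum = 1, 59

cv = std / mean               # coefficient of variation
variance = std ** 2           # variance from the standard deviation
iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum

print(f"CV={cv:.8f} variance={variance:.6f} IQR={iqr} range={value_range}")
```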
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1 | 257818 | 9.3%
2 | 206965 | 7.4%
3 | 196635 | 7.1%
4 | 187543 | 6.7%
5 | 175516 | 6.3%
6 | 162330 | 5.8%
7 | 150002 | 5.4%
8 | 139863 | 5.0%
9 | 130318 | 4.7%
10 | 120314 | 4.3%
Other values (49) | 1057885 | 38.0%

Minimum 10 values

Identical to the first ten rows of the Common Values table (values 1 to 10).

Maximum 10 values

Value | Count | Frequency (%)
59 | 36 | < 0.1%
58 | 36 | < 0.1%
57 | 36 | < 0.1%
56 | 36 | < 0.1%
55 | 36 | < 0.1%
54 | 45 | < 0.1%
53 | 62 | < 0.1%
52 | 62 | < 0.1%
51 | 76 | < 0.1%
50 | 146 | < 0.1%

line
Unsupported

Rejected  Unsupported 

Missing: 0
Missing (%): 0.0%
Memory size: 21.2 MiB

starting_station_IBNR
Real number (ℝ)

High correlation 

Distinct: 1733
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8016680.2
Minimum: 8000001
Maximum: 8098360
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 8000001
5-th percentile: 8000106
Q1: 8001864
Median: 8004241
Q3: 8010226
95-th percentile: 8089078
Maximum: 8098360
Range: 98359
Interquartile range (IQR): 8362

Descriptive statistics

Standard deviation: 29613.401
Coefficient of variation (CV): 0.0036939731
Kurtosis: 1.8030415
Mean: 8016680.2
Median Absolute Deviation (MAD): 2952
Skewness: 1.9195194
Sum: 2.232797 × 10^13
Variance: 8.7695352 × 10^8
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
8003184 | 38109 | 1.4%
8089116 | 35053 | 1.3%
8089111 | 34104 | 1.2%
8005106 | 28776 | 1.0%
8006750 | 26423 | 0.9%
8089078 | 24706 | 0.9%
8006319 | 24370 | 0.9%
8089053 | 23509 | 0.8%
8006404 | 23150 | 0.8%
8089022 | 21631 | 0.8%
Other values (1723) | 2505358 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
8000001 | 181 | < 0.1%
8000002 | 693 | < 0.1%
8000004 | 3447 | 0.1%
8000007 | 821 | < 0.1%
8000009 | 897 | < 0.1%
8000010 | 1624 | 0.1%
8000011 | 71 | < 0.1%
8000012 | 1731 | 0.1%
8000013 | 2491 | 0.1%
8000014 | 359 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
8098360 | 153 | < 0.1%
8089537 | 469 | < 0.1%
8089474 | 283 | < 0.1%
8089473 | 103 | < 0.1%
8089472 | 16127 | 0.6%
8089330 | 283 | < 0.1%
8089329 | 1416 | 0.1%
8089328 | 10 | < 0.1%
8089131 | 99 | < 0.1%
8089118 | 235 | < 0.1%

city
Text

Distinct: 1173
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 MiB

Length

Max length: 25
Median length: 22
Mean length: 9.327129
Min length: 3

Characters and Unicode

Total characters: 25977817
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: Aachen
2nd row: Aachen
3rd row: Aachen
4th row: Aachen
5th row: Aachen
Word | Count | Frequency (%)
berlin | 329373 | 10.1%
hamburg | 162071 | 5.0%
am | 61714 | 1.9%
main | 52698 | 1.6%
münchen | 52664 | 1.6%
bad | 46686 | 1.4%
frankfurt | 42841 | 1.3%
karlsruhe | 41627 | 1.3%
düsseldorf | 40213 | 1.2%
dortmund | 35100 | 1.1%
Other values (1230) | 2397081 | 73.5%
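
Entries such as "am", "main", and "bad" indicate that this table counts individual lowercase words rather than whole city names. The report does not show its tokenizer, so the following is only a sketch of that idea, assuming lowercasing and whitespace splitting; the sample rows are hypothetical and not taken from the dataset:

```python
from collections import Counter


def word_counts(values):
    """Count lowercase, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


# Hypothetical sample rows for illustration only.
sample = ["Frankfurt am Main", "Frankfurt am Main", "Bad Homburg", "Berlin"]
counts = word_counts(sample)
print(counts.most_common(3))
```

Counting tokens this way also explains why the number of distinct words in the table can exceed the number of distinct city names.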

Most occurring characters

Character | Count | Frequency (%)
e | 3092527 | 11.9%
n | 2365863 | 9.1%
r | 2222115 | 8.6%
a | 1671183 | 6.4%
i | 1530071 | 5.9%
l | 1133582 | 4.4%
t | 993122 | 3.8%
s | 958082 | 3.7%
h | 953468 | 3.7%
u | 941127 | 3.6%
Other values (50) | 10116677 | 38.9%

Most occurring categories

Category | Count | Frequency (%)
(unknown) | 25977817 | 100.0%

Most occurring scripts

Script | Count | Frequency (%)
(unknown) | 25977817 | 100.0%

Most occurring blocks

Block | Count | Frequency (%)
(unknown) | 25977817 | 100.0%

The most frequent characters per category, per script, and per block are identical to the Most occurring characters table, because every character falls into the single (unknown) group.

zip
Real number (ℝ)

High correlation 

Distinct: 1498
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47364.088
Minimum: 1067
Maximum: 99974
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 1067
5-th percentile: 4838
Q1: 16540
Median: 49074
Q3: 74177
95-th percentile: 90574
Maximum: 99974
Range: 98907
Interquartile range (IQR): 57637

Descriptive statistics

Standard deviation: 28885.385
Coefficient of variation (CV): 0.60985836
Kurtosis: -1.3897096
Mean: 47364.088
Median Absolute Deviation (MAD): 27153
Skewness: 0.013123773
Sum: 1.3191794 × 10^11
Variance: 8.3436547 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
76227 | 38109 | 1.4%
13353 | 35053 | 1.3%
14059 | 34104 | 1.2%
85354 | 31932 | 1.1%
22559 | 28776 | 1.0%
21147 | 26423 | 0.9%
22391 | 25418 | 0.9%
13597 | 25406 | 0.9%
14129 | 24710 | 0.9%
65203 | 23150 | 0.8%
Other values (1488) | 2492108 | 89.5%

Minimum 10 values

Value | Count | Frequency (%)
1067 | 10181 | 0.4%
1069 | 33 | < 0.1%
1097 | 589 | < 0.1%
1109 | 2723 | 0.1%
1129 | 879 | < 0.1%
1159 | 406 | < 0.1%
1187 | 5255 | 0.2%
1219 | 245 | < 0.1%
1237 | 50 | < 0.1%
1445 | 8 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
99974 | 588 | < 0.1%
99947 | 672 | < 0.1%
99880 | 3499 | 0.1%
99867 | 76 | < 0.1%
99817 | 296 | < 0.1%
99752 | 621 | < 0.1%
99734 | 356 | < 0.1%
99610 | 1891 | 0.1%
99518 | 8 | < 0.1%
99510 | 5 | < 0.1%
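
The smallest zip values have only four digits. If the column holds German postal codes, which are five digits long and may start with 0, then storing them as numbers has dropped the leading zero. That is an assumption based on the German city names in this dataset; the report does not describe the column. Under that assumption, a small illustrative helper (`format_plz` is a made-up name) restores the text form:

```python
def format_plz(zip_value: int) -> str:
    """Render a numeric zip as a five-character string, restoring leading zeros."""
    return f"{zip_value:05d}"


# Smallest and largest zip values listed for this variable.
print(format_plz(1067), format_plz(99974))
```

Treating the column as text would also avoid numeric statistics such as mean and variance, which carry little meaning for postal codes.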

last_station
Text

Missing 

Distinct: 5870
Distinct (%): 0.2%
Missing: 40796
Missing (%): 1.5%
Memory size: 21.2 MiB

Length

Max length: 50
Median length: 41
Mean length: 15.764374
Min length: 3

Characters and Unicode

Total characters: 43263637
Distinct characters: 48
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 70
Unique (%): < 0.1%

Sample

1st row: stolberg(rheinl)hbf gl.44
2nd row: eschweiler-st.jöris
3rd row: alsdorf poststraße
4th row: alsdorf-mariadorf
5th row: alsdorf-kellersberg

Word | Count | Frequency (%)
berlin | 288558 | 6.9%
hbf | 185604 | 4.4%
hamburg | 93284 | 2.2%
münchen | 83457 | 2.0%
s | 81299 | 1.9%
bad | 34406 | 0.8%
karlsruhe | 33118 | 0.8%
stuttgart | 29159 | 0.7%
ost | 26994 | 0.6%
leipzig | 26630 | 0.6%
Other values (5818) | 3323763 | 79.0%

Most occurring characters

Character | Count | Frequency (%)
e | 5119053 | 11.8%
r | 3751615 | 8.7%
n | 3516766 | 8.1%
a | 2621122 | 6.1%
s | 2340810 | 5.4%
h | 2337235 | 5.4%
l | 2272648 | 5.3%
i | 2148228 | 5.0%
t | 2082827 | 4.8%
b | 2028706 | 4.7%
Other values (38) | 15044627 | 34.8%

Most occurring categories

Category | Count | Frequency (%)
(unknown) | 43263637 | 100.0%

Most occurring scripts

Script | Count | Frequency (%)
(unknown) | 43263637 | 100.0%

Most occurring blocks

Block | Count | Frequency (%)
(unknown) | 43263637 | 100.0%

The most frequent characters per category, per script, and per block are identical to the Most occurring characters table, because every character falls into the single (unknown) group.

IBNR
Real number (ℝ)

High correlation  Missing 

Distinct: 5264
Distinct (%): 0.2%
Missing: 135873
Missing (%): 4.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 8019176.6
Minimum: 8000001
Maximum: 8099506
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 8000001
5-th percentile: 8000191
Q1: 8002047
Median: 8004440
Q3: 8011426
95-th percentile: 8089090
Maximum: 8099506
Range: 99505
Interquartile range (IQR): 9379

Descriptive statistics

Standard deviation: 32023.558
Coefficient of variation (CV): 0.0039933723
Kurtosis: 0.92620159
Mean: 8019176.6
Median Absolute Deviation (MAD): 2945
Skewness: 1.6805098
Sum: 2.1245333 × 10^13
Variance: 1.0255083 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
8089028 | 10583 | 0.4%
8098549 | 8892 | 0.3%
8004128 | 8825 | 0.3%
8089015 | 7451 | 0.3%
8004132 | 7446 | 0.3%
8004135 | 7444 | 0.3%
8004129 | 7444 | 0.3%
8098263 | 7442 | 0.3%
8004131 | 7441 | 0.3%
8004136 | 7430 | 0.3%
Other values (5254) | 2568918 | 92.2%
(Missing) | 135873 | 4.9%

Minimum 10 values

Value | Count | Frequency (%)
8000001 | 614 | < 0.1%
8000002 | 118 | < 0.1%
8000004 | 853 | < 0.1%
8000007 | 473 | < 0.1%
8000009 | 530 | < 0.1%
8000010 | 660 | < 0.1%
8000011 | 593 | < 0.1%
8000012 | 587 | < 0.1%
8000013 | 1099 | < 0.1%
8000014 | 679 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
8099506 | 225 | < 0.1%
8098553 | 5223 | 0.2%
8098549 | 8892 | 0.3%
8098360 | 33 | < 0.1%
8098348 | 225 | < 0.1%
8098263 | 7442 | 0.3%
8098205 | 3097 | 0.1%
8098193 | 542 | < 0.1%
8098147 | 3084 | 0.1%
8098105 | 5725 | 0.2%

long
Real number (ℝ)

High correlation  Missing 

Distinct: 3184
Distinct (%): 0.1%
Missing: 142208
Missing (%): 5.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.148847
Minimum: 0.834032
Maximum: 14.982644
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 0.834032
5-th percentile: 6.851719
Q1: 8.364945
Median: 9.902741
Q3: 12.130664
95-th percentile: 13.54746
Maximum: 14.982644
Range: 14.148612
Interquartile range (IQR): 3.765719

Descriptive statistics

Standard deviation: 2.2967626
Coefficient of variation (CV): 0.22630774
Kurtosis: -1.1021461
Mean: 10.148847
Median Absolute Deviation (MAD): 1.755656
Skewness: 0.13274197
Sum: 26823209
Variance: 5.2751183
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
11.536537 | 8022 | 0.3%
13.283966 | 7881 | 0.3%
11.575386 | 7373 | 0.3%
11.583234 | 7368 | 0.3%
11.548572 | 7363 | 0.3%
11.565619 | 7329 | 0.3%
11.593049 | 6923 | 0.2%
11.604971 | 6636 | 0.2%
11.519245 | 6549 | 0.2%
13.451646 | 6508 | 0.2%
Other values (3174) | 2571029 | 92.3%
(Missing) | 142208 | 5.1%

Minimum 10 values

Value | Count | Frequency (%)
0.834032 | 709 | < 0.1%
0.896632 | 730 | < 0.1%
6.070715 | 1431 | 0.1%
6.07384 | 897 | < 0.1%
6.074485 | 1049 | < 0.1%
6.08378 | 734 | < 0.1%
6.091499 | 1483 | 0.1%
6.094486 | 1286 | < 0.1%
6.097265 | 810 | < 0.1%
6.098877 | 740 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
14.982644 | 721 | < 0.1%
14.97908 | 421 | < 0.1%
14.936008 | 1 | < 0.1%
14.930408 | 720 | < 0.1%
14.902088 | 248 | < 0.1%
14.889318 | 698 | < 0.1%
14.825531 | 666 | < 0.1%
14.825234 | 764 | < 0.1%
14.805774 | 269 | < 0.1%
14.706775 | 299 | < 0.1%

lat
Real number (ℝ)

High correlation  Missing 

Distinct: 3191
Distinct (%): 0.1%
Missing: 142208
Missing (%): 5.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.950548
Minimum: 47.411032
Maximum: 55.021381
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 47.411032
5-th percentile: 48.043452
Q1: 49.379389
Median: 51.037414
Q3: 52.493827
95-th percentile: 53.918621
Maximum: 55.021381
Range: 7.610349
Interquartile range (IQR): 3.114438

Descriptive statistics

Standard deviation: 1.9056685
Coefficient of variation (CV): 0.037402316
Kurtosis: -0.95636919
Mean: 50.950548
Median Absolute Deviation (MAD): 1.479092
Skewness: 0.0059574693
Sum: 1.3466133 × 10^8
Variance: 3.6315725
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
48.142623 | 8022 | 0.3%
52.500737 | 7881 | 0.3%
48.137048 | 7373 | 0.3%
48.134202 | 7368 | 0.3%
48.141969 | 7363 | 0.3%
48.139452 | 7329 | 0.3%
48.129168 | 6923 | 0.2%
48.12744 | 6636 | 0.2%
48.14354 | 6549 | 0.2%
52.505976 | 6508 | 0.2%
Other values (3181) | 2571029 | 92.3%
(Missing) | 142208 | 5.1%

Minimum 10 values

Value | Count | Frequency (%)
47.411032 | 221 | < 0.1%
47.4179544 | 705 | < 0.1%
47.44003 | 188 | < 0.1%
47.456591 | 214 | < 0.1%
47.491452 | 398 | < 0.1%
47.5058367 | 1419 | 0.1%
47.513241 | 434 | < 0.1%
47.5251713 | 720 | < 0.1%
47.543785 | 719 | < 0.1%
47.544341 | 515 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
55.021381 | 731 | < 0.1%
55.019862 | 713 | < 0.1%
55.017947 | 748 | < 0.1%
55.01765 | 718 | < 0.1%
55.0149 | 734 | < 0.1%
55.012455 | 725 | < 0.1%
55.010432 | 736 | < 0.1%
55.008077 | 709 | < 0.1%
55.001937 | 726 | < 0.1%
54.988543 | 765 | < 0.1%

arrival_plan
Date

Missing 

Distinct: 10081
Distinct (%): 0.6%
Missing: 1183672
Missing (%): 42.5%
Memory size: 21.2 MiB
Minimum: 2024-07-07 23:37:00
Maximum: 2024-07-14 23:58:00
Histogram with fixed size bins (bins=50)

departure_plan
Date

Missing 

Distinct: 10080
Distinct (%): 0.6%
Missing: 972317
Missing (%): 34.9%
Memory size: 21.2 MiB
Minimum: 2024-07-08 00:00:00
Maximum: 2024-07-14 23:59:00
Histogram with fixed size bins (bins=50)

arrival_change
Date

Missing 

Distinct: 10112
Distinct (%): 0.7%
Missing: 1422439
Missing (%): 51.1%
Memory size: 21.2 MiB
Minimum: 2024-07-07 23:39:00
Maximum: 2024-07-15 01:00:00
Histogram with fixed size bins (bins=50)

departure_change
Date

Missing 

Distinct: 10107
Distinct (%): 0.7%
Missing: 1281468
Missing (%): 46.0%
Memory size: 21.2 MiB
Minimum: 2024-07-08 00:00:00
Maximum: 2024-07-15 01:00:00
Histogram with fixed size bins (bins=50)

arrival_delay_m
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 110
Distinct (%): < 0.1%
Missing: 972317
Missing (%): 34.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1088477
Minimum: 0
Maximum: 159
Zeros: 1256668
Zeros (%): 45.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 5
Maximum: 159
Range: 159
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.2636363
Coefficient of variation (CV): 2.9432682
Kurtosis: 110.72395
Mean: 1.1088477
Median Absolute Deviation (MAD): 0
Skewness: 7.8255256
Sum: 2010199
Variance: 10.651322
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 1256668 | 45.1%
1 | 218485 | 7.8%
2 | 113139 | 4.1%
3 | 69418 | 2.5%
4 | 38750 | 1.4%
5 | 26192 | 0.9%
6 | 18044 | 0.6%
7 | 12782 | 0.5%
8 | 10071 | 0.4%
9 | 8127 | 0.3%
Other values (100) | 41196 | 1.5%
(Missing) | 972317 | 34.9%

Minimum 10 values

Identical to the first ten rows of the Common Values table (values 0 to 9).

Maximum 10 values

Value | Count | Frequency (%)
159 | 1 | < 0.1%
157 | 1 | < 0.1%
140 | 1 | < 0.1%
136 | 1 | < 0.1%
134 | 1 | < 0.1%
133 | 2 | < 0.1%
120 | 1 | < 0.1%
117 | 1 | < 0.1%
116 | 1 | < 0.1%
110 | 7 | < 0.1%

departure_delay_m
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 113
Distinct (%): < 0.1%
Missing: 972317
Missing (%): 34.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1712868
Minimum: 0
Maximum: 159
Zeros: 1190749
Zeros (%): 42.8%
Negative: 0
Negative (%): 0.0%
Memory size: 21.2 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 6
Maximum: 159
Range: 159
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.3009847
Coefficient of variation (CV): 2.8182549
Kurtosis: 109.1751
Mean: 1.1712868
Median Absolute Deviation (MAD): 0
Skewness: 7.7631659
Sum: 2123393
Variance: 10.8965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 1190749 | 42.8%
1 | 264731 | 9.5%
2 | 128372 | 4.6%
3 | 70583 | 2.5%
4 | 39753 | 1.4%
5 | 26517 | 1.0%
6 | 18206 | 0.7%
7 | 12989 | 0.5%
8 | 10239 | 0.4%
9 | 8183 | 0.3%
Other values (103) | 42550 | 1.5%
(Missing) | 972317 | 34.9%

Minimum 10 values

Identical to the first ten rows of the Common Values table (values 0 to 9).

Maximum 10 values

Value | Count | Frequency (%)
159 | 1 | < 0.1%
156 | 1 | < 0.1%
137 | 1 | < 0.1%
135 | 1 | < 0.1%
134 | 2 | < 0.1%
133 | 1 | < 0.1%
132 | 1 | < 0.1%
120 | 1 | < 0.1%
117 | 1 | < 0.1%
115 | 1 | < 0.1%

info
Categorical

High correlation  Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 2201357
Missing (%): 79.0%
Memory size: 21.2 MiB
Information: 219535
Störung: 105210
Bauarbeiten: 88958
Information. (Quelle: zuginfo.nrw): 73213
Bauarbeiten. (Quelle: zuginfo.nrw): 64167
Other values (2): 32749

Length

Max length: 34
Median length: 11
Mean length: 16.518603
Min length: 7

Characters and Unicode

Total characters: 9644089
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Störung. (Quelle: zuginfo.nrw)
2nd row: Störung. (Quelle: zuginfo.nrw)
3rd row: Störung. (Quelle: zuginfo.nrw)
4th row: Störung. (Quelle: zuginfo.nrw)
5th row: Information

Common Values

Value | Count | Frequency (%)
Information | 219535 | 7.9%
Störung | 105210 | 3.8%
Bauarbeiten | 88958 | 3.2%
Information. (Quelle: zuginfo.nrw) | 73213 | 2.6%
Bauarbeiten. (Quelle: zuginfo.nrw) | 64167 | 2.3%
Störung. (Quelle: zuginfo.nrw) | 25423 | 0.9%
Großstörung | 7326 | 0.3%
(Missing) | 2201357 | 79.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Word | Count | Frequency (%)
information | 292748 | 32.2%
quelle | 162803 | 17.9%
zuginfo.nrw | 162803 | 17.9%
bauarbeiten | 153125 | 16.8%
störung | 130633 | 14.4%
großstörung | 7326 | 0.8%

Most occurring characters

Character | Count | Frequency (%)
n | 1202186 | 12.5%
o | 755625 | 7.8%
r | 753961 | 7.8%
e | 631856 | 6.6%
u | 616690 | 6.4%
i | 608676 | 6.3%
a | 598998 | 6.2%
t | 583832 | 6.1%
f | 455551 | 4.7%
l | 325606 | 3.4%
Other values (18) | 3111108 | 32.3%

Most occurring categories

Category | Count | Frequency (%)
(unknown) | 9644089 | 100.0%

Most occurring scripts

Script | Count | Frequency (%)
(unknown) | 9644089 | 100.0%

Most occurring blocks

Block | Count | Frequency (%)
(unknown) | 9644089 | 100.0%

The most frequent characters per category, per script, and per block are identical to the Most occurring characters table, because every character falls into the single (unknown) group.

canceled
Boolean

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value | Count | Frequency (%)
False | 2573834 | 92.4%
True | 211355 | 7.6%
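
The 61.2% imbalance reported for canceled in the Alerts section is consistent with a score defined as one minus the normalized Shannon entropy of the class counts. The report itself does not state the formula, so treat this definition as an assumption; `imbalance_score` is an illustrative name, not a function from the profiling tool:

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy of the class
    frequencies and k the number of classes (0 = balanced, 1 = one class only)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Class counts copied from the report.
canceled = imbalance_score([2573834, 211355])
transformed_info_message = imbalance_score([2201357, 292748, 153125, 130633, 7326])
print(f"canceled: {canceled:.1%}")
print(f"transformed_info_message: {transformed_info_message:.1%}")
```

Under this definition a perfectly balanced variable scores 0, so a 92.4% / 7.6% split already counts as strongly imbalanced.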

info_present
Boolean

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value | Count | Frequency (%)
False | 2201357 | 79.0%
True | 583832 | 21.0%

transformed_info_message
Categorical

High correlation  Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 MiB
No message: 2201357
Information: 292748
Bauarbeiten: 153125
Störung: 130633
Großstörung: 7326

Length

Max length: 11
Median length: 10
Mean length: 10.022009
Min length: 7

Characters and Unicode

Total characters: 27913190
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No message
2nd row: No message
3rd row: No message
4th row: No message
5th row: No message

Common Values

Value | Count | Frequency (%)
No message | 2201357 | 79.0%
Information | 292748 | 10.5%
Bauarbeiten | 153125 | 5.5%
Störung | 130633 | 4.7%
Großstörung | 7326 | 0.3%
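
These counts match what results from taking the info column, dropping the ". (Quelle: zuginfo.nrw)" suffix, and replacing missing values with "No message". The report does not include the transformation code, so the following is a reconstruction under that assumption; `transform_info` and `SOURCE_SUFFIX` are illustrative names:

```python
from collections import Counter

SOURCE_SUFFIX = ". (Quelle: zuginfo.nrw)"


def transform_info(info):
    """Map a raw info value to its message category (assumed rule)."""
    if info is None:
        return "No message"
    if info.endswith(SOURCE_SUFFIX):
        return info[: -len(SOURCE_SUFFIX)]
    return info


# Value counts of the info column, copied from the report (None = missing).
info_counts = {
    "Information": 219535,
    "Störung": 105210,
    "Bauarbeiten": 88958,
    "Information. (Quelle: zuginfo.nrw)": 73213,
    "Bauarbeiten. (Quelle: zuginfo.nrw)": 64167,
    "Störung. (Quelle: zuginfo.nrw)": 25423,
    "Großstörung": 7326,
    None: 2201357,
}

transformed = Counter()
for value, count in info_counts.items():
    transformed[transform_info(value)] += count

for category, count in transformed.most_common():
    print(category, count)
```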

Length

Histogram of lengths of the category

Common Values (Plot)

Word | Count | Frequency (%)
no | 2201357 | 44.1%
message | 2201357 | 44.1%
information | 292748 | 5.9%
bauarbeiten | 153125 | 3.1%
störung | 130633 | 2.6%
großstörung | 7326 | 0.1%

Most occurring characters

Character | Count | Frequency (%)
e | 4708964 | 16.9%
s | 4410040 | 15.8%
a | 2800355 | 10.0%
o | 2794179 | 10.0%
m | 2494105 | 8.9%
g | 2339316 | 8.4%
N | 2201357 | 7.9%
(space) | 2201357 | 7.9%
n | 876580 | 3.1%
r | 591158 | 2.1%
Other values (11) | 2495779 | 8.9%

Most occurring categories

Category | Count | Frequency (%)
(unknown) | 27913190 | 100.0%

Most occurring scripts

Script | Count | Frequency (%)
(unknown) | 27913190 | 100.0%

Most occurring blocks

Block | Count | Frequency (%)
(unknown) | 27913190 | 100.0%

The most frequent characters per category, per script, and per block are identical to the Most occurring characters table, because every character falls into the single (unknown) group.

clear_station_name
Unsupported

Missing  Rejected  Unsupported 

Missing: 2785189
Missing (%): 100.0%
Memory size: 21.2 MiB

Interactions

[Figure: 100 pairwise interaction plots (Matplotlib v3.9.2, rendered 2024-11-17); images not preserved]

Correlations

[Figure: correlation heatmap (Matplotlib v3.9.2, rendered 2024-11-17); image not preserved]

Variable | IBNR | ID_Base | ID_Timestamp | arrival_delay_m | canceled | departure_delay_m | info | info_present | lat | long | starting_station_IBNR | stop_number | transformed_info_message | zip
IBNR | 1.000 | -0.004 | 0.001 | -0.117 | 0.091 | -0.121 | 0.323 | 0.212 | 0.254 | 0.472 | 0.662 | 0.138 | 0.148 | -0.497
ID_Base | -0.004 | 1.000 | -0.001 | -0.001 | 0.003 | -0.001 | 0.031 | 0.012 | 0.002 | 0.003 | -0.009 | -0.002 | 0.013 | 0.005
ID_Timestamp | 0.001 | -0.001 | 1.000 | -0.024 | 0.021 | -0.023 | 0.080 | 0.038 | 0.005 | 0.003 | 0.001 | 0.002 | 0.039 | -0.006
arrival_delay_m | -0.117 | -0.001 | -0.024 | 1.000 | 0.036 | 0.823 | 0.027 | 0.012 | -0.264 | -0.110 | -0.147 | 0.317 | 0.010 | 0.242
canceled | 0.091 | 0.003 | 0.021 | 0.036 | 1.000 | 0.021 | 0.071 | 0.012 | 0.071 | 0.061 | 0.073 | 0.333 | 0.022 | 0.067
departure_delay_m | -0.121 | -0.001 | -0.023 | 0.823 | 0.021 | 1.000 | 0.027 | 0.012 | -0.283 | -0.115 | -0.158 | 0.273 | 0.010 | 0.261
info | 0.323 | 0.031 | 0.080 | 0.027 | 0.071 | 0.027 | 1.000 | 1.000 | 0.528 | 0.553 | 0.301 | 0.113 | 1.000 | 0.571
info_present | 0.212 | 0.012 | 0.038 | 0.012 | 0.012 | 0.012 | 1.000 | 1.000 | 0.172 | 0.192 | 0.189 | 0.099 | 1.000 | 0.224
lat | 0.254 | 0.002 | 0.005 | -0.264 | 0.071 | -0.283 | 0.528 | 0.172 | 1.000 | 0.214 | 0.267 | 0.002 | 0.202 | -0.540
long | 0.472 | 0.003 | 0.003 | -0.110 | 0.061 | -0.115 | 0.553 | 0.192 | 0.214 | 1.000 | 0.440 | 0.057 | 0.202 | -0.260
starting_station_IBNR | 0.662 | -0.009 | 0.001 | -0.147 | 0.073 | -0.158 | 0.301 | 0.189 | 0.267 | 0.440 | 1.000 | 0.140 | 0.133 | -0.574
stop_number | 0.138 | -0.002 | 0.002 | 0.317 | 0.333 | 0.273 | 0.113 | 0.099 | 0.002 | 0.057 | 0.140 | 1.000 | 0.060 | -0.043
transformed_info_message | 0.148 | 0.013 | 0.039 | 0.010 | 0.022 | 0.010 | 1.000 | 1.000 | 0.202 | 0.202 | 0.133 | 0.060 | 1.000 | 0.234
zip | -0.497 | 0.005 | -0.006 | 0.242 | 0.067 | 0.261 | 0.571 | 0.224 | -0.540 | -0.260 | -0.574 | -0.043 | 0.234 | 1.000
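To read the correlation matrix above programmatically, one option is to list every variable pair whose absolute coefficient reaches a cutoff. The sketch below does this for a four-variable excerpt of the matrix. The function name `high_correlation_pairs` is hypothetical, and the cutoff of 0.5 is an assumption: it is consistent with the "High correlation" alerts listed in the report overview, but the report itself does not state the threshold.

```python
def high_correlation_pairs(matrix, threshold=0.5):
    """Return (var_a, var_b, coefficient) for each unordered pair whose
    absolute coefficient is at least `threshold`, strongest first."""
    names = sorted(matrix)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:  # skip the diagonal and mirrored entries
            value = matrix[a][b]
            if abs(value) >= threshold:
                pairs.append((a, b, value))
    return sorted(pairs, key=lambda p: -abs(p[2]))


# Excerpt of the coefficients reported above (four of the fourteen variables).
corr = {
    "arrival_delay_m": {"arrival_delay_m": 1.000, "departure_delay_m": 0.823, "lat": -0.264, "zip": 0.242},
    "departure_delay_m": {"arrival_delay_m": 0.823, "departure_delay_m": 1.000, "lat": -0.283, "zip": 0.261},
    "lat": {"arrival_delay_m": -0.264, "departure_delay_m": -0.283, "lat": 1.000, "zip": -0.540},
    "zip": {"arrival_delay_m": 0.242, "departure_delay_m": 0.261, "lat": -0.540, "zip": 1.000},
}

flagged = high_correlation_pairs(corr)
for a, b, value in flagged:
    print(f"{a} ~ {b}: {value:+.3f}")
```

Comparing absolute values matters here because a strongly negative pair such as lat and zip (-0.540) is as informative as a positive one.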

Missing values

[Figure: missing-value count chart; image not preserved]
A simple visualization of nullity by column.
[Figure: nullity matrix; image not preserved]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap; image not preserved]
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
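Nullity correlation can be illustrated by turning each column into a present/missing indicator and correlating the indicators. The sketch below is a plain-Python Pearson correlation on 0/1 flags, offered as an assumption about how such a heatmap is typically computed rather than as the profiler's exact method. The function name `nullity_correlation` is hypothetical, and the three short columns only encode which cells are present in the last ten rows of the Sample section (the string "x" stands in for any non-missing value).

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Missing cells are represented by None. Returns NaN when either column
    is entirely present or entirely missing (zero variance).
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Presence pattern of the last ten sample rows (indices 2785179 to 2785188).
arrival_plan = [None, "x", None, "x", "x", None, "x", None, None, "x"]
arrival_change = [None, "x", None, "x", "x", None, "x", None, None, "x"]
ibnr = ["x", "x", "x", "x", "x", "x", "x", "x", None, "x"]

same_pattern = nullity_correlation(arrival_plan, arrival_change)
weak_pattern = nullity_correlation(arrival_plan, ibnr)
print(round(same_pattern, 3), round(weak_pattern, 3))
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means the missingness of one says little about the other. Ten rows are far too few to draw conclusions about the full dataset.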

Sample

(index) | ID_Base | ID_Timestamp | stop_number | line | starting_station_IBNR | city | zip | last_station | IBNR | long | lat | arrival_plan | departure_plan | arrival_change | departure_change | arrival_delay_m | departure_delay_m | info | canceled | info_present | transformed_info_message | clear_station_name
0 | -2065137557584893414 | 2407082237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-08 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
1 | -2065137557584893414 | 2407092237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-09 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
2 | -2065137557584893414 | 2407102237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-10 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
3 | -2065137557584893414 | 2407112237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-11 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
4 | -2065137557584893414 | 2407122237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-12 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
5 | -2065137557584893414 | 2407132237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-13 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
6 | -2065137557584893414 | 2407142237 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-14 22:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
7 | -3561454673811003901 | 2407082137 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-08 21:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
8 | -3561454673811003901 | 2407092137 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-09 21:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN
9 | -3561454673811003901 | 2407102137 | 1 | 29 | 8000001 | Aachen | 52064 | NaN | 8000001.0 | 6.091499 | 50.7678 | NaN | 2024-07-10 21:37:00 | NaN | NaN | 0.0 | 0.0 | NaN | True | False | No message | NaN

(index) | ID_Base | ID_Timestamp | stop_number | line | starting_station_IBNR | city | zip | last_station | IBNR | long | lat | arrival_plan | departure_plan | arrival_change | departure_change | arrival_delay_m | departure_delay_m | info | canceled | info_present | transformed_info_message | clear_station_name
2785179 | 6234297817509604666 | 2407112012 | 2 | 70 | 8098360 | Bürstadt | 68642 | frankfurt-niederrad | 8002050.0 | 9.703958 | 47.552786 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | No message | NaN
2785180 | 6234297817509604666 | 2407112012 | 3 | 70 | 8098360 | Bürstadt | 68642 | walldorf(hess) | 8006175.0 | 8.580811 | 50.001339 | 2024-07-11 20:24:00 | 2024-07-11 20:25:00 | 2024-07-11 20:25:00 | 2024-07-11 20:25:00 | 1.0 | 0.0 | Information | False | True | Information | NaN
2785181 | 6234297817509604666 | 2407112012 | 4 | 70 | 8098360 | Bürstadt | 68642 | mörfelden | 8004065.0 | 7.666105 | 47.615929 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | No message | NaN
2785182 | 6234297817509604666 | 2407112012 | 5 | 70 | 8098360 | Bürstadt | 68642 | groß gerau-dornberg | 8002386.0 | 8.494709 | 49.912279 | 2024-07-11 20:33:00 | 2024-07-11 20:34:00 | 2024-07-11 20:33:00 | 2024-07-11 20:34:00 | 0.0 | 0.0 | Information | False | True | Information | NaN
2785183 | 6234297817509604666 | 2407112012 | 6 | 70 | 8098360 | Bürstadt | 68642 | riedstadt-goddelau | 8000126.0 | 8.489187 | 49.833230 | 2024-07-11 20:39:00 | 2024-07-11 20:39:00 | 2024-07-11 20:39:00 | 2024-07-11 20:40:00 | 0.0 | 1.0 | Information | False | True | Information | NaN
2785184 | 6234297817509604666 | 2407112012 | 7 | 70 | 8098360 | Bürstadt | 68642 | stockstadt(rhein) | 8005740.0 | 8.125738 | 49.196735 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | No message | NaN
2785185 | 6234297817509604666 | 2407112012 | 8 | 70 | 8098360 | Bürstadt | 68642 | biebesheim | 8000951.0 | 8.473978 | 49.781977 | 2024-07-11 20:45:00 | 2024-07-11 20:45:00 | 2024-07-11 20:45:00 | 2024-07-11 20:45:00 | 0.0 | 0.0 | Information | False | True | Information | NaN
2785186 | 6234297817509604666 | 2407112012 | 9 | 70 | 8098360 | Bürstadt | 68642 | gernsheim | 8002249.0 | 12.242452 | 51.822244 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | No message | NaN
2785187 | 6234297817509604666 | 2407112012 | 10 | 70 | 8098360 | Bürstadt | 68642 | groß-rohrheim | NaN | 12.078006 | 50.882649 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | False | False | No message | NaN
2785188 | 6234297817509604666 | 2407112012 | 11 | 70 | 8098360 | Bürstadt | 68642 | biblis | 8000503.0 | 8.450413 | 49.688881 | 2024-07-11 20:56:00 | 2024-07-11 20:57:00 | 2024-07-11 20:56:00 | 2024-07-11 20:57:00 | 0.0 | 0.0 | Information | False | True | Information | NaN
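In the sample rows that carry both a planned and a changed timestamp, the delay columns appear to equal the difference between the two in minutes (for example, arrival at walldorf(hess) planned 20:24:00, changed 20:25:00, arrival_delay_m 1.0). The sketch below checks that reading on two rows. It is an inference from the sample, not a documented definition of the columns, and the helper name `delay_minutes` is hypothetical.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # format used in the sample rows


def delay_minutes(plan, change):
    """Difference between changed and planned time, in minutes.

    Returns None when either timestamp is missing. This is an assumption
    of the sketch: the first sample rows show a delay of 0.0 even though
    the changed timestamp is NaN, so the dataset applies a different rule
    in that case.
    """
    if plan is None or change is None:
        return None
    delta = datetime.strptime(change, TIMESTAMP_FORMAT) - datetime.strptime(plan, TIMESTAMP_FORMAT)
    return delta.total_seconds() / 60.0


# Row 2785180, arrival at walldorf(hess): reported arrival_delay_m is 1.0
walldorf_arrival = delay_minutes("2024-07-11 20:24:00", "2024-07-11 20:25:00")
# Row 2785183, departure from riedstadt-goddelau: reported departure_delay_m is 1.0
goddelau_departure = delay_minutes("2024-07-11 20:39:00", "2024-07-11 20:40:00")
# Row 2785182, arrival at groß gerau-dornberg: reported arrival_delay_m is 0.0
dornberg_arrival = delay_minutes("2024-07-11 20:33:00", "2024-07-11 20:33:00")

print(walldorf_arrival, goddelau_departure, dornberg_arrival)
```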